SQAD: Simple Question Answering Database

Authors

  • Marek Medved
  • Ales Horák
Abstract

In this paper, we present a new free resource for comparable evaluation of Czech question answering. The Simple Question Answering Database, SQAD, contains 3301 questions and answers extracted and processed from the Czech Wikipedia. The SQAD database was prepared to support precision evaluation of automatic question answering systems. No such resource was previously available for the Czech language. We describe the process of SQAD creation, the processing of the texts by automatic tokenization (Unitok) and morphological disambiguation (Desamb), and the subsequent semi-automatic cleaning and post-processing. We also show the results of the first version of a Czech question answering system named SBQA (syntax-based question answering).

Similar resources

Enlargement of the Czech Question-Answering Dataset to SQAD v2.0

In this paper, we present the second version of Czech question-answering dataset called SQAD v2.0 (Simple Question Answering Database). The new version represents a large extension of our original SQAD database. In the current release, the dataset contains nearly 9,000 question-answer pairs completed with manual annotation of question and answer types. All texts in the dataset (the source docum...

A Net Structure Based Relational Question Answerer: Description and Examples

A question answering system is described which uses a net structure for storage of information. The net structure consists of nodes and labelled edges, which represent relations between the nodes. The labels are also nodes, and therefore definitions of relations may be stored in the net. It is demonstrated that the generality and complexity of this memory structure allows a surprisingly p...

Question Answering, Semantic Search and Data Service Querying

Question Answering (QA) systems have profoundly evolved since their inception as natural language interfaces to databases. QA technology has indeed become state-of-the-art on open-domain, unstructured information retrieval and more recently touched the Semantic Web and the problem of querying structured data (e.g. RDF triples) on the Web. A natural new challenge is QA over data services, suppor...

How Can Geographic Information Retrieval Benefit from Geovisualization Principles?

This paper proposes to enhance geographic information retrieval interfaces with geovisualization techniques such as multiple representations and rich interactions. It supports its proposition by examples and illustrations of GeoPubMed prototype which was designed to enhance geospatial access to medical citations in PubMed database, the premier database of health publications. Such an enhancemen...

Exploring Deep Learning Models for Machine Comprehension on SQuAD

This paper explores the use of multiple models in performing question answering tasks on the Stanford Question Answering Database. We first implement and share results of a baseline model using bidirectional long short-term memory (BiLSTM) encoding of question and context followed by a simple co-attention model [1]. We then report on the use of match-LSTM and Pointer Net which showed marked improv...

Publication date: 2014